JetKVM Advanced, CGO-based 2-way Audio Support #718
base: dev
db2d107 to 4f47d62
Great news! I'll soon update this PR with Audio Input pass-through functionality too.
c9f4aea to 3444607
…g audio input issues
…udio system for easy debugging and troubleshooting
…udio system for easy debugging and troubleshooting
… of if/else for better readability
Implement SIMD-optimized audio operations using ARM NEON for Cortex-A7 targets
Update Makefile and CI configuration to support NEON compilation flags
Add SIMD implementations for common audio operations including:
- Sample clearing and interleaving
- Volume scaling and format conversion
- Channel manipulation and balance adjustment
- Endianness swapping and prefetching
I'm still reviewing, but I need some clarity on what I'm looking at here. It LOOKS like you're spinning up an entire second copy of the device-side Go application and running IPC between them. That seems REALLY fragile and wasteful. Can't we just do all the processing and relaying within goroutines?
try {
  if (isMuted) {
    // Unmute: Start audio output process and notify backend
    const resp = await api.POST("/audio/mute", { muted: false });
Will this cross to the device even when running on app.jetkvm.com?
I would have expected this sort of communication to cross over RPC
@@ -795,7 +825,7 @@ export const useMacrosStore = create<MacrosState>((set, get) => ({

   const { sendFn } = get();
   if (!sendFn) {
-    console.warn("JSON-RPC send function not available.");
+    // console.warn("JSON-RPC send function not available.");
Why commented out? Why not deleted?
-  } catch (error) {
-    console.error("Failed to load macros:", error);
+  } catch {
+    // console.error("Failed to load macros:", _error);
Why commented out? Why not deleted?
Summary
This PR introduces comprehensive audio support to JetKVM for the first time (See #315). Audio is now bidirectional, enabling both listening to the remote device's audio output and sending microphone input from your browser to the managed device. This functionality leverages JetKVM's USB Gadget capabilities, where the device poses as both an audio output device and a microphone to the managed system via the USB connection. Audio is captured directly from the managed device (via ALSA), encoded using Opus (via CGO), and streamed in real time to the user's browser using WebRTC. Additionally, microphone input from the browser is received via WebRTC, decoded, and played back through the managed device's USB gadget audio interface.
JetKVM Audio Architecture
Overview
JetKVM implements a dual-subprocess audio architecture that provides bidirectional audio streaming between the remote PC and the browser. Audio output and audio input are each processed in a separate, dedicated Go subprocess, while the main process handles WebRTC communication and session management. The aim of this split is to isolate audio processing from KVM operations, so that load or failures in the audio path do not disturb KVM control.
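To make the subprocess split more concrete, here is a minimal sketch of how one binary could decide which role to play from its command line. The flag names `--audio-output-server` and `--audio-input-server` and the use of `os.Args` come from this description; the `Mode` type, `detectMode`, and `subprocessCommand` are hypothetical names introduced only for illustration and are not taken from the PR's code.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// Mode identifies which role this process instance should play.
// Hypothetical type, used only in this sketch.
type Mode int

const (
	ModeMain Mode = iota
	ModeAudioOutputServer
	ModeAudioInputServer
)

// detectMode scans the arguments for the two subprocess flags named in the
// PR description. Any other argument list means "main process".
func detectMode(args []string) Mode {
	for _, a := range args {
		switch a {
		case "--audio-output-server":
			return ModeAudioOutputServer
		case "--audio-input-server":
			return ModeAudioInputServer
		}
	}
	return ModeMain
}

// subprocessCommand builds, but does not start, a command that re-executes
// the given binary with one of the subprocess flags. Hypothetical helper.
func subprocessCommand(self, flag string) *exec.Cmd {
	cmd := exec.Command(self, flag)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	switch detectMode(os.Args[1:]) {
	case ModeAudioOutputServer:
		fmt.Println("would run the audio output server loop")
	case ModeAudioInputServer:
		fmt.Println("would run the audio input server loop")
	default:
		self, err := os.Executable()
		if err != nil {
			fmt.Fprintln(os.Stderr, "cannot locate own binary:", err)
			os.Exit(1)
		}
		cmd := subprocessCommand(self, "--audio-output-server")
		fmt.Println("main process; a supervisor could start:", cmd.Args)
	}
}
```

A supervisor in the main process would call `Start()` on such a command and restart it if it exits. That restart logic is left out here because the PR text does not describe its policy.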
Key Architecture Features:
Architecture Components
1. Main Process
- (`/audio/*`, `/microphone/*`)
- `/webrtc/signaling/client`
- `AudioServerSupervisor` and `AudioInputSupervisor` for subprocess management
- `audio.SessionProvider` interface via `KVMSessionProvider`
- `WriteOpusFrame()` and `WriteOpusFrameZeroCopy()` methods
- `AudioInputMetrics`
- `audio.StartMetricsUpdater()` for performance monitoring

2. Audio Output Server Subprocess
- `cgoAudioInit()` and `cgoAudioReadEncode()`
- (`hw:1,0`) for output capture
- (`/var/run/audio_output.sock`)
- `StartNonBlockingAudioStreaming()`
- `--audio-output-server` command line flag, detected via `os.Args` parsing
- (`JETKVM_OPUS_*` variables)

3. Audio Input Server Subprocess
- `cgoAudioPlaybackInit()` and `cgoAudioDecodeWrite()`
- (`hw:1,0`) for input playback with fallback to `default`
- (`/var/run/audio_input.sock`)
- `--audio-input-server` command line flag, detected via `os.Args` parsing
- `NewAudioInputServer()` with graceful shutdown handling and triple-goroutine design
- `CGOAudioPlaybackInit()` for ALSA and Opus decoder initialization
- `JETKVM_AUDIO_INPUT_IPC` and `JETKVM_OPUS_*` variables for configuration

4. Inter-Process Communication
- `UnifiedAudioServer` and `UnifiedAudioClient` components
- `/var/run/audio_output.sock` for output frame transmission from subprocess to main process
- `/var/run/audio_input.sock` for input frame transmission from main process to subprocess
- `AudioOutputSupervisor` and `AudioInputSupervisor`
- `AudioOutputClient` and `AudioInputClient` handle connection management and frame transmission
- `sync.Pool` for reduced GC pressure
- `UnifiedIPCOpusConfig`
Frontend Components
Audio Control Interface
WebRTC Integration
- `Start()` lifecycle management
- `relayLoop()` with efficient frame forwarding
- `SetMuted()` with immediate effect
- `forwardToWebRTC()` method

React Hooks Integration
- (`audio-mute-changed`, `audio-metrics-update`, `microphone-state-changed`, `process-metrics`)

Microphone Management
- `WriteOpusFrame()` and zero-copy `WriteOpusFrameZeroCopy()` operations
- `IsRunning()` method with thread-safe operations
- `GetMetrics()` for monitoring and optimization

Audio Output Flow (Remote PC → Browser)
The audio output flow captures audio from the remote PC and streams it to the browser through the following pipeline:
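One stage of that flow, the relay from the output subprocess to the WebRTC track, can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the `TrackWriter` interface, its `WriteFrame` method, the `Relay` struct, and `countingTrack` are hypothetical. Only the names `relayLoop()` and `SetMuted()` appear in the description. A channel stands in for frames arriving over the IPC socket.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// TrackWriter is a hypothetical stand-in for the AudioTrackWriter interface
// named in the PR; its real method set is not shown there.
type TrackWriter interface {
	WriteFrame(opus []byte) error
}

// Relay forwards encoded frames to a track unless it is muted.
type Relay struct {
	muted int32 // 0 = unmuted, 1 = muted; accessed atomically
	track TrackWriter
}

// SetMuted takes effect on the next frame the loop inspects.
func (r *Relay) SetMuted(m bool) {
	var v int32
	if m {
		v = 1
	}
	atomic.StoreInt32(&r.muted, v)
}

// relayLoop drains frames until the channel closes or a write fails and
// reports how many frames were forwarded.
func (r *Relay) relayLoop(frames <-chan []byte) int {
	forwarded := 0
	for f := range frames {
		if atomic.LoadInt32(&r.muted) == 1 {
			continue // drop frames while muted
		}
		if err := r.track.WriteFrame(f); err != nil {
			return forwarded
		}
		forwarded++
	}
	return forwarded
}

// countingTrack is a test double that only counts the frames it receives.
type countingTrack struct{ frames int }

func (c *countingTrack) WriteFrame(opus []byte) error {
	c.frames++
	return nil
}

func main() {
	track := &countingTrack{}
	relay := &Relay{track: track}

	frames := make(chan []byte, 2)
	frames <- []byte{1}
	frames <- []byte{2}
	close(frames)
	fmt.Println("forwarded while unmuted:", relay.relayLoop(frames))

	relay.SetMuted(true)
	more := make(chan []byte, 1)
	more <- []byte{3}
	close(more)
	fmt.Println("forwarded while muted:", relay.relayLoop(more))
}
```

Keeping the mute flag as an atomic value means the loop never blocks on a lock in the per-frame path, which is one plausible reading of "immediate effect" in the description.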
Audio Data Flow with Optimization Points
Comprehensive Audio Data Path Diagram
Optimization Legend
Performance Targets
Simplified High-Level Flow
Audio Input Flow (Browser → Remote PC)
Detailed Output Pipeline
- (`hw:1,0`)
- `cgoAudioReadEncode()`
- `jetkvm_audio_read_encode()` function
- `AudioOutputServer`
- `AudioRelay` receives frames from IPC socket via `relayLoop()` and forwards to WebRTC using `AudioTrackWriter` interface
- `forwardToWebRTC()` method
- `SetMuted()` method with real-time state management

Audio Input Flow (Browser → Remote PC)
Detailed Input Pipeline
- `AudioInputSupervisor` with connection health monitoring
- `/var/run/audio_input.sock` to input subprocess using `AudioInputClient` with automatic reconnection
- `NewAudioInputServer()` processes frames:
- `cgoAudioDecodeWrite()` and ALSA playback
- `jetkvm_audio_decode_write()` C function
- (`hw:1,0`) for remote PC microphone input
- `SendOpusConfig()` IPC messages

Process Architecture
Main Process Responsibilities
- (`/audio/*`, `/microphone/*`) via Gin web framework
- `KVMSessionProvider` interface
- `AudioRelay` forwards frames from output subprocess IPC to WebRTC tracks
- `AudioInputIPCManager` manages IPC communication with input subprocess
- `ProcessMonitor` tracks subprocess CPU/memory usage

Audio Output Server Subprocess Responsibilities
- (`hw:1,0`) for remote PC audio via `cgoAudioInit()`
- `cgoAudioReadEncode()` with C-based ALSA/Opus integration
- `/var/run/audio_output.sock`
- (`JETKVM_OPUS_*`)

Audio Input Server Subprocess Responsibilities
- (`hw:1,0` with fallback to `default`) via `cgoAudioPlaybackInit()`
- `cgoAudioDecodeWrite()` with C-based ALSA/Opus integration
- `/var/run/audio_input.sock`
- `--audio-input-server` command line flag detection

Build & Tooling
- (`setup_toolchain`, `build_audio_deps`, `dev_env`)
- `tools/` for idempotent setup of cross-compiler and static ALSA/Opus libraries

Frontend Integration
Documentation
Disclaimer
Credits
Thanks!
Alex